Identification of genes and haplotypes that predict rheumatoid arthritis using random forests
Abstract
Random forest (RF) analysis of genetic data does not require specification of the mode of inheritance and provides measures of variable importance that incorporate interaction effects. In this paper we describe RF-based approaches for assessing gene and haplotype importance, and apply them to a subset of the North American Rheumatoid Arthritis Consortium case-control data provided by Genetic Analysis Workshop 16. The RF analyses of 37 genes identified many of the same genes as logistic regression, but also suggested the importance of certain single-nucleotide polymorphisms and genes that were not ranked highly by logistic regression. A new permutation method did not reveal strong evidence of gene-gene interaction effects in these data. Although RFs are a promising approach for genetic data analysis, extensions beyond simple single-nucleotide polymorphism analyses and modifications to improve computational feasibility are needed.
Similar resources
Classification of rheumatoid arthritis status with candidate gene and genome-wide single-nucleotide polymorphisms using random forests
Using the North American Rheumatoid Arthritis Consortium (NARAC) candidate gene and genome-wide single-nucleotide polymorphism (SNP) data sets, we applied regression methods and tree-based random forests to identify genetic associations with rheumatoid arthritis (RA) and to predict RA disease status. Several genes were consistently identified as weakly associated with RA without a significant i...
Relationship between Pain-Related Beliefs and Pain Anxiety with Depression in Patients with Rheumatoid Arthritis
Aims and background: Chronic pain has many emotional and psychological pressures and is a complex psychological experience. The purpose of the study was to investigate the relationship of pain-related beliefs and pain anxiety with depression in patients with rheumatoid arthritis. Materials and methods: It was a descriptive-correlational study thus, among patients with rheumatoid arthritis in E...
Detecting significant single-nucleotide polymorphisms in a rheumatoid arthritis study using random forests
Random forest is an efficient approach for investigating not only the effects of individual markers on a trait but also the effect of the interactions among the markers in genetic association studies. This approach is especially appealing for the analysis of genome-wide data, such as those obtained from gene expression/single-nucleotide polymorphism (SNP) array experiments in which the number o...
Identifying rheumatoid arthritis susceptibility genes using high-dimensional methods
Although several genes (including a strong effect in the human leukocyte antigen (HLA) region) and some environmental factors have been implicated to cause susceptibility to rheumatoid arthritis (RA), the etiology of the disease is not completely understood. The ability to screen the entire genome for association to complex diseases has great potential for identifying gene effects. However, the...
Differential Expression of Rheumatoid Factor-Associated Cross-Reactive Idiotypes in Iranian Seropositive and Seronegative Patients with Rheumatoid Arthritis
High levels of rheumatoid factors (RF) are detectable in serum of the majority of patients with rheumatoid arthritis (RA), but 5-10% of patients remain seronegative (SN). Despite clinical and genetic similarities between these two subsets of RA, it has been proposed that they may be regarded as distinct clinical entities. Methods: In the present study a panel of monoclonal antibodies (mAb) rec...